Incremental Organization for Data Recording and Warehousing
Authors
Abstract
Data warehouses and recording systems typically receive a large, continuous stream of incoming data that must be stored in a manner suitable for future access. Access to stored records is usually based on a key. Organizing the data on disk as it arrives using standard techniques would result in either (a) one or more I/Os to store each incoming record (to keep the data clustered by the key), which is too expensive when data arrival rates are very high, or (b) many I/Os to locate the records for a particular key, such as a customer (if the data is stored clustered by arrival order). We study two techniques, inspired by external sorting algorithms, that store data incrementally as it arrives while providing good performance for both recording and querying. We present concurrency control and recovery schemes for both techniques. We show the benefits of our techniques both analytically and experimentally.
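The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's method (the abstract does not spell out the two techniques); it only shows the general external-sorting idea of buffering arrivals, writing them out as sorted runs, and occasionally merging runs so that a key lookup has few runs to search. The class `RunStore`, its parameters `buffer_capacity` and `max_runs`, and the `(key, payload)` record layout are all hypothetical names chosen for illustration, and in-memory lists stand in for disk files.

```python
import bisect
import heapq
from typing import List, Tuple

Record = Tuple[int, str]  # (key, payload); hypothetical record layout


class RunStore:
    """Toy model of sorted runs; lists stand in for sequentially written disk files."""

    def __init__(self, buffer_capacity: int = 4, max_runs: int = 3) -> None:
        self.buffer_capacity = buffer_capacity  # records held in memory before a flush
        self.max_runs = max_runs                # runs tolerated before merging
        self.buffer: List[Record] = []
        self.runs: List[List[Record]] = []

    def insert(self, key: int, payload: str) -> None:
        # Recording is cheap: append to memory, flush only when the buffer fills.
        self.buffer.append((key, payload))
        if len(self.buffer) >= self.buffer_capacity:
            self._flush()

    def _flush(self) -> None:
        # Sort the buffered records by key and write them out as one new run.
        self.runs.append(sorted(self.buffer))
        self.buffer = []
        # Too many runs make lookups slow, so merge them all into a single run.
        if len(self.runs) > self.max_runs:
            self.runs = [list(heapq.merge(*self.runs))]

    def lookup(self, key: int) -> List[str]:
        # Binary-search each sorted run, then scan the small unsorted buffer.
        found: List[str] = []
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            while i < len(run) and run[i][0] == key:
                found.append(run[i][1])
                i += 1
        found.extend(payload for k, payload in self.buffer if k == key)
        return found
```

In this sketch an insert never touches more than the in-memory buffer until a flush, and a lookup costs one binary search per run, so `max_runs` is the knob that trades recording cost against query cost.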
Similar resources
The Cubetree Storage Organization
Relational On-Line Analytical Processing (ROLAP) is emerging as the dominant approach in data warehousing. To enhance query performance, the ROLAP approach relies on selecting appropriate subsets of aggregate views and materializing them in summary tables, which are then used to speed up OLAP queries. However, a straightforward relational storage implementation of materialized ROL...
Incremental Maintenance of Object-Oriented Views in a Warehousing Environment
Data warehousing is an approach to data integration in which integrated information is stored in a data warehouse for direct querying and analysis. To provide fast access, a data warehouse stores materialized views defined over data from its data sources. As a result, a data warehouse needs to be maintained to keep its contents consistent with the contents of its data sources. Incremental maint...
Using Schema Transformation Pathways for Incremental View Maintenance
In heterogeneous data warehousing environments, autonomous data sources are integrated into a materialized integrated database. The schemas of the data sources and the integrated database may be expressed in different modelling languages. It is possible for either the data or the schemas of the data sources to be updated. Incremental view maintenance is one of the problems being addressed in da...
Performance Issues in Incremental Warehouse Maintenance
A well-known challenge in data warehousing is the efficient incremental maintenance of warehouse data in the presence of source data updates. In this paper, we identify several critical data representation and algorithmic choices that must be made when developing the machinery of an incrementally maintained data warehouse. For each decision area, we identify various alternatives and evaluate th...
Incremental Load in a Data Warehousing Environment
Incremental load is an important factor for successful data warehousing. Lack of standardized incremental refresh methodologies can lead to poor analytical results, which can be unacceptable to an organization’s analytical community. Successful data warehouse implementation depends on consistent metadata as well as incremental data load techniques. If consistent load timestamps are maintained a...
Publication date: 1997